 approximation function



IPTQ-ViT: Post-Training Quantization of Non-linear Functions for Integer-only Vision Transformers

Kim, Gihwan, Lee, Jemin, Kim, Hyungshin

arXiv.org Artificial Intelligence

Previous Quantization-Aware Training (QAT) methods for vision transformers rely on expensive retraining to recover accuracy loss in non-linear layer quantization, limiting their use in resource-constrained environments. In contrast, existing Post-Training Quantization (PTQ) methods either partially quantize non-linear functions or adjust activation distributions to maintain accuracy but fail to achieve fully integer-only inference. In this paper, we introduce IPTQ-ViT, a novel PTQ framework for fully integer-only vision transformers without retraining. We present approximation functions: a polynomial-based GELU optimized for vision data and a bit-shifting-based Softmax designed to improve approximation accuracy in PTQ. In addition, we propose a unified metric integrating quantization sensitivity, perturbation, and computational cost to select the optimal approximation function per activation layer. IPTQ-ViT outperforms previous PTQ methods, achieving up to 6.44%p (avg. 1.78%p) top-1 accuracy improvement for image classification and 1.0 mAP for object detection. IPTQ-ViT outperforms partial floating-point PTQ methods under W8A8 and W4A8, and achieves accuracy and latency comparable to integer-only QAT methods. We plan to release our code at https://github.com/gihwan-kim/IPTQ-ViT.git.
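
To make the idea of a bit-shifting-based Softmax concrete, here is a minimal floating-point sketch of one generic shift-style approximation. It is not the function proposed in the paper, whose details the abstract does not give. The sketch assumes the common decomposition exp(x) = 2^(-t) with t = -x * log2(e), splits t into an integer part q and a fractional part r, and replaces 2^(-r) with the linear term (1 - r/2). The names exp_shift, softmax_shift, and softmax_ref are hypothetical.

```python
import math

LOG2_E = 1.4426950408889634  # log2(e)

def exp_shift(x):
    """Approximate exp(x) for x <= 0 as (1 - r/2) * 2**(-q).

    In an integer kernel the factor 2**(-q) would be a right shift by q bits;
    math.ldexp plays that role in this floating-point sketch.
    """
    t = -x * LOG2_E            # exp(x) == 2**(-t), with t >= 0
    q = math.floor(t)          # integer part, drives the shift
    r = t - q                  # fractional part, 0 <= r < 1
    return math.ldexp(1.0 - 0.5 * r, -q)

def softmax_shift(logits):
    """Softmax built on exp_shift, with the usual max subtraction."""
    m = max(logits)
    weights = [exp_shift(v - m) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def softmax_ref(logits):
    """Reference softmax using math.exp, for comparison only."""
    m = max(logits)
    weights = [math.exp(v - m) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]
```

The linear term overestimates 2^(-r) by at most about 6 percent, and normalization cancels part of that error, so the approximate and reference outputs stay close on typical logits.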


Multimodal Generative Learning Utilizing Jensen-Shannon-Divergence

Neural Information Processing Systems

Learning from different data types is a long-standing goal in machine learning research, as multiple information sources co-occur when describing natural phenomena.


Situation Model of the Transport, Transport Emissions and Meteorological Conditions

Benes, V., Svitek, M., Michalikova, A., Melicherik, M.

arXiv.org Artificial Intelligence

Air pollution in cities and the possibilities of reducing it represent one of the most important issues that today's society has to deal with. This paper takes a systemic approach to traffic emissions and their relation to meteorological conditions, analyzing the effect of weather on the quantity and dispersion of traffic emissions in a city. Using fuzzy inference systems (FIS), a model for predicting changes in emissions under various conditions is developed. The proposed model is based on traffic, meteorology and emission data measured in Prague, Czech Republic. The main objective of the work is to provide insight into how urban planners and policymakers can plan and manage urban transport more effectively with environmental protection in mind.


"KAN you hear me?" Exploring Kolmogorov-Arnold Networks for Spoken Language Understanding

Koudounas, Alkis, La Quatra, Moreno, Pastor, Eliana, Siniscalchi, Sabato Marco, Baralis, Elena

arXiv.org Artificial Intelligence

Kolmogorov-Arnold Networks (KANs) have recently emerged as a promising alternative to traditional neural architectures, yet their application to speech processing remains underexplored. This work presents the first investigation of KANs for Spoken Language Understanding (SLU) tasks. We experiment with 2D-CNN models on two datasets, integrating KAN layers in five different configurations within the dense block. The best-performing setup, which places a KAN layer between two linear layers, is directly applied to transformer-based models and evaluated on five SLU datasets with increasing complexity. Our results show that KAN layers can effectively replace the linear layers, achieving comparable or superior performance in most cases. Finally, we provide insights into how KAN and linear layers on top of transformers attend differently to input regions of the raw waveforms.


Silver Linings in the Shadows: Harnessing Membership Inference for Machine Unlearning

Sula, Nexhi, Kumar, Abhinav, Hou, Jie, Wang, Han, Tourani, Reza

arXiv.org Artificial Intelligence

With the continued advancement and widespread adoption of machine learning (ML) models across various domains, ensuring user privacy and data security has become a paramount concern. In compliance with data privacy regulations, such as GDPR, a secure machine learning framework should not only grant users the right to request the removal of their contributed data used for model training but also facilitate the elimination of sensitive data fingerprints within machine learning models to mitigate potential attacks, a process referred to as machine unlearning. In this study, we present a novel unlearning mechanism designed to effectively remove the impact of specific data samples from a neural network while considering the performance of the unlearned model on the primary task. To achieve this goal, we crafted a novel loss function tailored to eliminate privacy-sensitive information from the weights and activation values of the target model by combining target classification loss and membership inference loss. Our adaptable framework can easily incorporate various privacy leakage approximation mechanisms to guide the unlearning process. We provide empirical evidence of the effectiveness of our unlearning approach with a theoretical upper-bound analysis through a membership inference mechanism as a proof of concept. Our results showcase the superior performance of our approach in terms of unlearning efficacy and latency as well as the fidelity of the primary task, across four datasets and four deep learning architectures.


AnyLoss: Transforming Classification Metrics into Loss Functions

Han, Doheon, Moniz, Nuno, Chawla, Nitesh V

arXiv.org Artificial Intelligence

Many evaluation metrics can be used to assess the performance of models in binary classification tasks. However, most of them are derived from a confusion matrix in a non-differentiable form, making it very difficult to generate a differentiable loss function that could directly optimize them. The lack of solutions to this challenge not only hinders our ability to solve difficult tasks, such as imbalanced learning, but also requires the deployment of computationally expensive hyperparameter search processes in model selection. In this paper, we propose a general-purpose approach that transforms any confusion matrix-based metric into a loss function, AnyLoss, that can be used in optimization processes. To this end, we use an approximation function to represent the confusion matrix in a differentiable form, and this approach enables any confusion matrix-based metric to be directly used as a loss function. The mechanism of the approximation function is provided to ensure its operability, and the differentiability of our loss functions is proved by presenting their derivatives. We conduct extensive experiments under diverse neural networks with many datasets, and we demonstrate their general availability to target any confusion matrix-based metric. In particular, our method shows outstanding achievements in dealing with imbalanced datasets, and its competitive learning speed, compared to multiple baseline models, underscores its efficiency.
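
As a rough illustration of how a confusion matrix can be made differentiable, the sketch below pushes each predicted probability toward 0 or 1 with a steep sigmoid and accumulates "soft" counts, from which a loss such as 1 - F1 follows. The amplifying sigmoid, the slope value, and all function names are illustrative assumptions, not the paper's exact definitions.

```python
import math

def amplify(p, slope=10.0):
    """Steep sigmoid centred at 0.5; maps a probability close to 0 or 1.

    The slope is a hypothetical choice made for this sketch.
    """
    return 1.0 / (1.0 + math.exp(-slope * (p - 0.5)))

def soft_confusion(y_true, y_prob, slope=10.0):
    """Return soft (TP, FN, FP, TN) counts that vary smoothly with y_prob."""
    tp = fn = fp = tn = 0.0
    for y, p in zip(y_true, y_prob):
        a = amplify(p, slope)        # soft "predicted positive" indicator
        tp += y * a
        fn += y * (1.0 - a)
        fp += (1 - y) * a
        tn += (1 - y) * (1.0 - a)
    return tp, fn, fp, tn

def soft_f1_loss(y_true, y_prob, slope=10.0, eps=1e-12):
    """Loss = 1 - soft F1; eps guards against division by zero."""
    tp, fn, fp, _ = soft_confusion(y_true, y_prob, slope)
    f1 = 2.0 * tp / (2.0 * tp + fp + fn + eps)
    return 1.0 - f1
```

Because every count is a sum of smooth functions of the predicted probabilities, the resulting loss has well-defined gradients, unlike a metric computed from hard thresholded predictions. Other confusion matrix-based metrics can be swapped in by changing only the final formula.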


Digital Twin and Artificial Intelligence Incorporated With Surrogate Modeling for Hybrid and Sustainable Energy Systems

Khan, Abid Hossain, Omar, Salauddin, Mushtary, Nadia, Verma, Richa, Kumar, Dinesh, Alam, Syed

arXiv.org Artificial Intelligence

Surrogate modeling has brought about a revolution in computation in the branches of science and engineering. Backed by Artificial Intelligence, a surrogate model can deliver highly accurate results in significantly less computation time than computer simulation of the actual models. Surrogate modeling techniques have found their use in numerous branches of science and engineering, energy system modeling being one of them. Since the idea of hybrid and sustainable energy systems is spreading rapidly in the modern world for the paradigm of the smart energy shift, researchers are exploring the future application of artificial intelligence-based surrogate modeling in analyzing and optimizing hybrid energy systems. One of the promising technologies for assessing applicability for the energy system is the digital twin, which can leverage surrogate modeling. This work presents a comprehensive framework/review on Artificial Intelligence-driven surrogate modeling and its applications with a focus on the digital twin framework and energy systems. The role of machine learning and artificial intelligence in constructing an effective surrogate model is explained. After that, different surrogate models developed for different sustainable energy sources are presented. Finally, digital twin surrogate models and associated uncertainties are described.


On regression analysis with Padé approximants

Yevkin, Glib, Yevkin, Olexandr

arXiv.org Machine Learning

The advantages and difficulties of applying Padé approximants to two-dimensional regression analysis are discussed. A new formulation of residuals is suggested for the method of least squares. It leads to a system of linear equations in the case of rational functions. The possibility of using the Tikhonov regularization technique to avoid overfitting is demonstrated in this approach. To illustrate the efficiency of the suggested method, several practical cases from physics and reliability theory are considered.
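
One standard way to obtain a linear system when fitting a rational function y ≈ P(x)/Q(x) is to multiply through by the denominator and minimize the residuals P(x_i) - y_i Q(x_i), which are linear in the coefficients. The sketch below does this with Q(x) = 1 + q_1 x + ... and an optional Tikhonov (ridge) term lam. Whether this matches the authors' formulation of residuals is an assumption, and the names fit_pade, eval_pade, and solve_linear are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_pade(xs, ys, num_deg, den_deg, lam=0.0):
    """Fit P(x)/Q(x) with Q(x) = 1 + q_1 x + ... + q_m x^m.

    Linearized residual: P(x_i) - y_i * Q(x_i), so each data point gives
    sum_j p_j x_i^j - y_i * sum_k q_k x_i^k = y_i.
    lam adds Tikhonov regularization to the normal equations.
    """
    rows = [
        [x ** j for j in range(num_deg + 1)]
        + [-y * x ** k for k in range(1, den_deg + 1)]
        for x, y in zip(xs, ys)
    ]
    n = num_deg + 1 + den_deg
    # Normal equations: (A^T A + lam * I) c = A^T y
    ata = [
        [sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    c = solve_linear(ata, aty)
    return c[:num_deg + 1], c[num_deg + 1:]

def eval_pade(p, q, x):
    """Evaluate the fitted rational function at x."""
    num = sum(c * x ** j for j, c in enumerate(p))
    den = 1.0 + sum(c * x ** (k + 1) for k, c in enumerate(q))
    return num / den
```

With noise-free samples of an exact rational function and lam=0, the fit recovers the generating coefficients up to rounding. A small positive lam shrinks the coefficients, which is the usual trade of a little bias for more stability on noisy data.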


[Explained] Machine Learning Fundamentals: Optimization Problems and How to Solve Them

#artificialintelligence

If you start to look into machine learning and the math behind it, you will quickly notice that everything comes down to an optimization problem. Even the training of neural networks is basically just finding the optimal parameter configuration for a very high-dimensional function. In this article, we will solve a simple machine learning problem step by step. We will see why and how it always comes down to an optimization problem, which parameters are optimized and how we compute the optimal value in the end. To start, let's have a look at a simple dataset (x1, x2): If you are lucky, one computer in the dataset has exactly the same age as yours, but that's highly unlikely.